Import PDB data

pdb_stats <- read.csv("./Data Export Summary.csv", row.names=1)
pdb_stats
##                          X.ray   NMR   EM Multiple.methods Neutron Other  Total
## Protein (only)          142419 11807 6038              177      70    32 160543
## Protein/Oligosaccharide   8426    31  991                5       0     0   9453
## Protein/NA                7498   274 2000                3       0     0   9775
## Nucleic acid (only)       2368  1378   60                8       2     1   3817
## Other                      149    31    3                0       0     0    183
## Oligosaccharide (only)      11     6    0                1       0     4     22

Q1: What percentage of structures in the PDB are solved by X-ray and Electron Microscopy?

X-ray: 87.53%; Electron Microscopy: 4.95%

# Find the fraction of all structures solved by each method separately
sum(pdb_stats$X.ray) / sum(pdb_stats$Total)
## [1] 0.8752836
sum(pdb_stats$EM) / sum(pdb_stats$Total)
## [1] 0.0494687
# Compute percentages across all columns (i.e. all experimental methods)
round(colSums(pdb_stats) / sum(pdb_stats$Total) * 100, 2)
##            X.ray              NMR               EM Multiple.methods 
##            87.53             7.36             4.95             0.11 
##          Neutron            Other            Total 
##             0.04             0.02           100.00
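The Q1 calculation can also be reproduced without the CSV file. The following is a minimal sketch that rebuilds the table from the values printed above and wraps the percentage calculation in a small helper. The names `pdb_demo` and `method_percent` are hypothetical and are not part of the original analysis.

```r
# Rebuild the summary table from the printed values (assumption: the
# printed output above is complete and unrounded).
pdb_demo <- data.frame(
  X.ray            = c(142419, 8426, 7498, 2368, 149, 11),
  NMR              = c(11807, 31, 274, 1378, 31, 6),
  EM               = c(6038, 991, 2000, 60, 3, 0),
  Multiple.methods = c(177, 5, 3, 8, 0, 1),
  Neutron          = c(70, 0, 0, 2, 0, 0),
  Other            = c(32, 0, 0, 1, 0, 4),
  Total            = c(160543, 9453, 9775, 3817, 183, 22),
  row.names = c("Protein (only)", "Protein/Oligosaccharide", "Protein/NA",
                "Nucleic acid (only)", "Other", "Oligosaccharide (only)")
)

# Sanity check: the method columns should add up to the Total column.
method_cols <- setdiff(names(pdb_demo), "Total")
stopifnot(all(rowSums(pdb_demo[, method_cols]) == pdb_demo$Total))

# Hypothetical helper: percentage of all structures solved by one method.
method_percent <- function(stats, method) {
  round(100 * sum(stats[[method]]) / sum(stats$Total), 2)
}

method_percent(pdb_demo, "X.ray")  # 87.53, matching the answer above
method_percent(pdb_demo, "EM")     # 4.95, matching the answer above
```

Selecting the column by name with `[[` keeps the helper usable for any method column, including `NMR` or `Neutron`.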

Q2: What proportion of structures in the PDB are protein?

87.35% (protein-only entries)

# Select the protein-only row by name rather than by position
round(pdb_stats["Protein (only)", "Total"] / sum(pdb_stats$Total) * 100, 2)
## [1] 87.35
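The same share can be computed for every molecular type at once. This is a minimal sketch that uses a named vector of the `Total` values printed in the table above. The names `totals`, `type_percent`, and `protein_containing` are hypothetical, and treating every row whose name starts with "Protein" as protein-containing is an assumption about how the question could be read more broadly.

```r
# Totals per molecular type, copied from the table printed above.
totals <- c(
  "Protein (only)"          = 160543,
  "Protein/Oligosaccharide" = 9453,
  "Protein/NA"              = 9775,
  "Nucleic acid (only)"     = 3817,
  "Other"                   = 183,
  "Oligosaccharide (only)"  = 22
)

# Percentage of all PDB entries belonging to each molecular type.
type_percent <- round(100 * totals / sum(totals), 2)
type_percent["Protein (only)"]  # 87.35, matching the answer above

# Broader reading (assumption): count complexes that contain protein too.
protein_containing <- grepl("^Protein", names(totals))
round(100 * sum(totals[protein_containing]) / sum(totals), 2)
```

Indexing by name makes the result independent of the row order in the exported CSV.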

Q3: Type HIV in the search box on the PDB website home page. How many HIV-1 protease structures are in the current PDB?

23409

Use VMD to explore protein structure

Import protein structure